Performance Analysis of Optical Character Recognition Using Adaptive Binarization of Degraded Document Images

Authors

  • M. Manimaraboopathy
  • M. Anto Bennet
  • A. Priya
  • S. Vijayalakshmi
  • D. Hemavathy
Abstract

The proposed OCR algorithm retrieves the text in scanned document images. The text detection algorithm is based on two machine learning classifiers: one generates candidate word regions and the other filters out non-text ones. Connected components (CCs) are extracted from the images using the maximally stable extremal region (MSER) algorithm. In CC clustering, AdaBoost classifiers are used to determine whether or not a region contains text. A binarization method then converts the gray image into a binary image. The binarization outcomes are passed to OCR, and the corresponding results are evaluated with respect to character and word accuracy. More and more text documents are being scanned, which calls for fast and accurate processing. Additional performance metrics are the percentage rates of broken and unreadable text, false alarms, background noise, character enlargement, and merging. The effectiveness of the proposed method is also confirmed by tests carried out on realistic document images. The proposed algorithm is implemented using the MATLAB version 13 software.
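The abstract evaluates OCR output by character and word accuracy but does not give the formulas. The sketch below shows one common way to compute such scores, assuming accuracy is defined as one minus the edit distance divided by the reference length. That definition and the function names (`edit_distance`, `character_accuracy`, `word_accuracy`) are illustrative assumptions, not the authors' MATLAB code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or word lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from ref
                            curr[j - 1] + 1,      # insert h into ref
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 1 - edit_distance / len(ref), clamped at 0."""
    if len(ref) == 0:
        return 1.0 if len(hyp) == 0 else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def character_accuracy(ref_text, ocr_text):
    """Accuracy over individual characters."""
    return accuracy(ref_text, ocr_text)


def word_accuracy(ref_text, ocr_text):
    """Accuracy over whitespace-separated words."""
    return accuracy(ref_text.split(), ocr_text.split())
```

For example, `word_accuracy("the quick brown fox", "the quick brawn fox")` counts one substituted word out of four. The same OCR output scores higher on character accuracy, because only one character differs.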

Similar resources

Binarization of Document Image

Document image binarization is performed in the preprocessing stage of document analysis and aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, t...
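To make the idea of separating foreground text from background concrete, here is a minimal sketch of global thresholding with Otsu's method. Otsu's method is a standard baseline chosen here for illustration; the excerpt does not say which technique the listed paper uses. The function names and the list-of-rows image representation are assumptions.

```python
def otsu_threshold(pixels):
    """Return the gray level (0..255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_lo = 0    # number of pixels with gray level <= t
    sum_lo = 0  # sum of gray levels of those pixels
    for t in range(256):
        w_lo += hist[t]
        sum_lo += t * hist[t]
        w_hi = total - w_lo
        if w_lo == 0:
            continue  # no pixels in the dark class yet
        if w_hi == 0:
            break     # all pixels are in the dark class
        mean_lo = sum_lo / w_lo
        mean_hi = (sum_all - sum_lo) / w_hi
        var_between = w_lo * w_hi * (mean_lo - mean_hi) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(image, threshold):
    """Map pixels <= threshold to 0 (text) and all others to 255 (background)."""
    return [[0 if p <= threshold else 255 for p in row] for row in image]


# Tiny grayscale image: dark strokes (20..30) on a light page (200..220).
image = [[200, 210, 30],
         [20, 220, 25]]
t = otsu_threshold([p for row in image for p in row])
binary = binarize(image, t)
```

A single global threshold works when the page is evenly lit. Degraded documents with stains or uneven illumination are the case where locally adaptive methods are usually preferred.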


Optical Character Recognition from Degraded Document Images

Segmentation of the text from badly degraded document images is a very challenging task due to the high inter/intra variation between the document background and the foreground text of different types of document images. In this paper, a novel document image binarization technique is used to address the issues in degraded document images by using adaptive image contrast. The adaptive image...


Degraded Document Image Binarization Using Optical Character Recognition

The proposed OCR algorithm retrieves the text in scanned document images. The text detection algorithm is based on two machine learning classifiers: one generates candidate word regions and the other filters out non-text ones. Connected components (CCs) are extracted from the images using the maximally stable extremal region algorithm. In CC clustering, AdaBoost classifiers are used t...


A Quad Tree Based Binarization Approach to Improve Quality of Degraded Document Images

This paper proposes a novel binarization algorithm for converting grayscale and color images into black and white images. Binarization is one of the most important processes in research pertaining to document image processing and pattern recognition. Since the quality of the binary image plays a critical role in the further processing of the document, especially in the ...


Adaptive Binarization of Unconstrained Hand-Held Camera-Captured Document Images

This paper presents a new adaptive binarization technique for degraded hand-held camera-captured document images. State-of-the-art locally adaptive binarization methods are sensitive to the values of their free parameters. This problem is more critical when binarizing degraded camera-captured document images because of distortions like non-uniform illumination, bad shading, blurring, smearing and ...
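The sensitivity to free parameters mentioned above can be illustrated with Sauvola's rule, a widely used locally adaptive method. It is shown here only as an example and is not necessarily the technique this paper proposes or compares against. Each pixel gets the threshold T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The `window`, `k`, and `R` defaults and the pure-Python image representation are assumptions made for this sketch.

```python
import math


def sauvola_binarize(image, window=15, k=0.2, R=128.0):
    """Binarize a grayscale image (list of rows, values 0..255) with Sauvola's rule."""
    height, width = len(image), len(image[0])
    half = window // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Neighborhood around (x, y), clipped at the image borders.
            values = [image[j][i]
                      for j in range(max(0, y - half), min(height, y + half + 1))
                      for i in range(max(0, x - half), min(width, x + half + 1))]
            n = len(values)
            mean = sum(values) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
            threshold = mean * (1.0 + k * (std / R - 1.0))
            row.append(0 if image[y][x] <= threshold else 255)
        out.append(row)
    return out


def page_with_center(value, size=5, background=200):
    """Hypothetical test image: a uniform page with one stroke pixel in the center."""
    img = [[background] * size for _ in range(size)]
    img[size // 2][size // 2] = value
    return img


faint = page_with_center(170)  # low-contrast stroke
missed = sauvola_binarize(faint, window=5, k=0.2)
found = sauvola_binarize(faint, window=5, k=0.05)
```

On this toy image the faint stroke is classified as background with `k=0.2` and as text with `k=0.05`. This is the kind of parameter dependence the excerpt describes. The brute-force window loop is kept for readability; practical implementations usually compute the local statistics with integral images.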



Publication date: 2017